Moonbounce raises $12 million to scale its AI control engine, which translates content moderation policies into consistent AI behavior. The startup, founded by a former Facebook insider, aims to keep AI governance predictable as AI systems take on more content moderation responsibilities.
Learn to build a real-time hate speech detection system using Python and transformer models, similar to Penemue's AI platform for identifying online hate and digital violence.
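A minimal sketch of the kind of classifier such a tutorial might start from, using the Hugging Face transformers library. The checkpoint, label string, and threshold below are illustrative assumptions, not the article's (or Penemue's) actual stack.

```python
# Minimal hate speech detector built on a pretrained transformer.
# Checkpoint, label name, and threshold are illustrative assumptions.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="facebook/roberta-hate-speech-dynabench-r4-target",
)

def flag_message(text: str, threshold: float = 0.9) -> bool:
    """Return True when the model labels the text hateful with high confidence."""
    pred = detector(text, truncation=True)[0]  # e.g. {"label": "hate", "score": 0.97}
    return pred["label"] == "hate" and pred["score"] >= threshold

# A "real-time" system would wrap this in a stream consumer;
# here we just score a couple of messages synchronously.
for msg in ["Hope you have a great day.", "I can't believe people like you exist."]:
    print(msg, "->", flag_message(msg))
```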
Learn to build an AI content analysis tool that can detect potentially problematic language in chatbot responses, prompted by the legal issues surrounding Grok and former Swiss president Karin Keller-Sutter.
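As a rough illustration of such an analysis tool, here is a rule-based first pass over a chatbot response. The patterns and risk categories are invented for the example; a real system would layer named-entity recognition and a trained classifier on top of keyword rules.

```python
import re

# Illustrative patterns only; a production tool would combine NER
# with a trained classifier rather than rely on keyword matching.
RISK_PATTERNS = [
    (re.compile(r"\b(criminal|fraudster|corrupt)\b", re.I), "defamation-risk"),
    (re.compile(r"\b(kill|attack|eliminate)\b", re.I), "violence-risk"),
]

def analyze_response(response: str) -> list[dict]:
    """Scan a chatbot response and return flagged terms with a risk category."""
    findings = []
    for pattern, category in RISK_PATTERNS:
        for match in pattern.finditer(response):
            findings.append({
                "category": category,
                "term": match.group(0),
                "offset": match.start(),
            })
    return findings

print(analyze_response("The minister is corrupt and should be eliminated."))
```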
This article explains how automated AI systems for content moderation can produce erroneous takedown notices, examining their technical architecture, trade-offs, and legal implications.
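The core trade-off the article examines can be sketched in a few lines: an automated pipeline issues a notice whenever a classifier's confidence clears a threshold, so the threshold directly sets the false-positive rate. The data class, scores, and IDs below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    content_id: str
    score: float  # classifier confidence that the content violates policy

def issue_takedown_notices(detections: list[Detection], threshold: float) -> list[str]:
    """Return the content IDs that receive an automated takedown notice."""
    return [d.content_id for d in detections if d.score >= threshold]

# Hypothetical scores: item "b" is a borderline false positive.
queue = [Detection("a", 0.97), Detection("b", 0.55), Detection("c", 0.12)]
print(issue_takedown_notices(queue, threshold=0.5))  # ['a', 'b'] -> erroneous notice for 'b'
print(issue_takedown_notices(queue, threshold=0.9))  # ['a']     -> borderline violations slip through
```

Lowering the threshold catches more real violations but mechanically generates more erroneous notices, which is where the legal exposure the article discusses comes from.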
Meta's Oversight Board warns that Community Notes are ill-equipped to handle the growing threat of AI-generated disinformation, especially in vulnerable regions.
OpenAI has discontinued its erotic mode for ChatGPT, following a pattern of abandoning experimental features amid regulatory pressure and ethical concerns. The move reflects the company's strategic shift toward prioritizing safety and responsible development.
Meta has launched new AI content enforcement systems to improve platform safety and reduce reliance on third-party vendors. The company claims these tools will detect more violations with greater accuracy and respond more quickly to real-world events.
Meta's deepfake detection methods are insufficient for handling misinformation during armed conflicts, according to its own Oversight Board. The board is calling for a major overhaul of how the company identifies and surfaces deepfake content.